ABSTRACT

We quantify the variability of faint unresolved optical sources using a catalog based on multiple SDSS imaging
observations. The catalog covers SDSS stripe 82, which lies along the celestial equator in the southern Galactic
hemisphere (22h24m < α_J2000.0 < 04h08m, −1.27° < δ_J2000.0 < +1.27°, ~290 deg²), and contains 34 million photometric
observations in the SDSS ugriz system for 748,084 unresolved sources at high Galactic latitudes (b < −20°) that were
observed at least four times in each of the ugri bands (with a median of 10 observations obtained over ~6 yr). In each
photometric bandpass we compute various low-order light-curve statistics, such as rms scatter, χ² per degree of
freedom, skewness, and minimum and maximum magnitude, and use them to select and study variable sources. We find that
2% of unresolved optical sources brighter than g = 20.5 appear variable at the 0.05 mag level (rms) simultaneously in
the g and r bands (at high Galactic latitudes). The majority (2 out of 3) of these variable sources are low-redshift (z < 2)
quasars, although they represent only 2% of all sources in the adopted flux-limited sample. We find that at least 90%
of quasars are variable at the 0.03 mag level (rms) and confirm that variability is as good a method for finding
low-redshift quasars as the UV excess color selection (at high Galactic latitudes). We analyze the distribution of light-curve
skewness for quasars and find that it is centered on zero. We find that about one-fourth of the variable stars are RR
Lyrae stars, and that only 0.5% of stars from the main stellar locus are variable at the 0.05 mag level. The distribution
of light-curve skewness in the g − r versus u − g color-color diagram on the main stellar locus is found to be bimodal
(with one mode consistent with Algol-like behavior). Using over 600 RR Lyrae stars, we demonstrate rich halo
substructure out to distances of 100 kpc. We extrapolate these results to the expected performance by the Large Synoptic
Survey Telescope and estimate that it will obtain well-sampled, 2% accurate, multicolor light curves for ~2 million
low-redshift quasars and discover at least 50 million variable stars.

Key  words:  Galaxy:  halo  —  Galaxy:  stellar  content  —  quasars:  general  —  stars:  Population  II  — 
stars:  variables:  other 
1. INTRODUCTION

Variability  is  an  important  phenomenon  in  astrophysical  studies 
of  structure  and  evolution,  both  stellar  and  galactic.  Some  variable 
stars,  such  as  RR  Lyrae  stars,  are  an  excellent  tool  for  studying  the 
Galaxy. Being nearly standard candles (thus making distance determination
relatively straightforward) and being intrinsically bright,
they are a particularly suitable tracer of Galactic structure. In
extragalactic astronomy, the optical continuum variability of quasars
is  utilized  as  an  efficient  method  for  their  discovery  (van  den  Bergh 
et  al.  1973;  Hawkins  1983;  Koo  et  al.  1986;  Hawkins  &  Veron 
1995)  and  is  also  frequently  used  to  constrain  the  origin  of  their 
emission  (Kawaguchi  et  al.  1998;  Trevese  et  al.  2001;  Martini  & 
Schneider  2003). 

Despite  the  importance  of  variability,  the  variable  optical  sky 
remains  largely  unexplored  and  poorly  quantified,  especially  at 
the faint end. To what degree different variable populations contribute
to the overall variability, how they are distributed in magnitude
and color, and what the characteristic timescales and dominant
mechanisms  of  variability  are,  are  just  some  of  the  questions  that 
still  remain  to  be  answered.  To  address  these  questions,  several 
contemporary  projects  aimed  at  regular  monitoring  of  the  optical 
sky  were  started.  Some  of  the  more  prominent  surveys  in  terms 
of  sky  coverage,  depth,  and  cadence  are  as  follows: 

1. The Faint Sky Variability Survey (Groot et al. 2003) is a
very deep (17 < V < 24) BVI survey of 23 deg² of sky, containing
about 80,000 sources sampled at timescales ranging from
minutes to years.

2. The QUEST Survey (Vivas et al. 2004) monitors 700 deg²
of sky from V = 13.5 to a limit of V = 21.

3. ROTSE-I (Akerlof et al. 2000) monitors the entire observable
sky twice a night from V = 10 to a limit of V = 15.5.
The Northern Sky Variability Survey (Wozniak et al. 2004) is
based on ROTSE-I data.

4. The All Sky Automated Survey (Pojmanski 2002) monitors
the entire southern and part of the northern sky (δ < 25°) to
a limit of V = 15.

5. OGLE (OGLE II; Udalski et al. 2002) monitors ~100 deg²
toward the Galactic bulge from I = 11.5 to a limit of I = 20.
Due to the very high stellar density toward the bulge, OGLE II
has detected about 270,000 variable stars (Wozniak et al. 2002;
Zebrun et al. 2001).

6. The MACHO Project monitored the brightness of ~60 million
stars in ~90 deg² of sky toward the Magellanic Clouds and
the Galactic bulge for ~7 yr to a limit of V ~ 24 (Alcock et al.
2001).

A  comprehensive  review  of  past  and  ongoing  variability  surveys 
can  be  found  in  Becker  et  al.  (2004). 

Recognizing the outstanding importance of variable objects,
the last Decadal Survey Report (National Research Council 2001)
highly recommended a major new initiative for studying the variable
sky: the large survey telescope. The two most ambitious proposals
for such a telescope are the Pan-STARRS project (Kaiser
et al. 2002) and the Large Synoptic Survey Telescope^21 (LSST;
Tyson 2002; Walker 2003). The initial version of Pan-STARRS,
with the first 1.8 m telescope (four are planned), has already had
its first light, and the 8.4 m LSST will have its first light in 2014
(if approved for construction in 2009).

LSST will offer an unprecedented view of the faint variable
sky; according to the current designs it will scan the entire accessible
sky every three nights to a limit of V ~ 25 with two observations
per night in two different bands (selected from a set of
six). One of the LSST science goals^22 will be the exploration of
the transient optical sky: the discovery and analysis of rare and
exotic objects (e.g., neutron star and black hole binaries),
gamma-ray bursts, X-ray flashes, and new classes of transients, such as
binary mergers and stellar disruptions by black holes. The observed
volume of space, and the requirement to recognize and
monitor these events in real time on a "normally" variable sky,
will present a challenge to the project.

Since LSST will utilize^23 the Sloan Digital Sky Survey (SDSS;
York et al. 2000) photometric system (ugriz; Fukugita et al. 1996),
multiple photometric observations obtained by the SDSS represent
an excellent data set for a pre-LSST study that characterizes
the faint variable sky and quantifies the variable population
and its distribution in magnitude-color-variability space.
Here we present such a study of unresolved sources in a region that
has been imaged multiple times by the SDSS.

21 See http://www.lsst.org.

22 For more details, see http://www.lsst.org/Science/science_goals.shtml.

23 LSST will also use the Y band at ~1 μm. For more details, see the LSST
Science Requirements Document at http://www.lsst.org/Science/lsst_baseline.shtml.

In § 2 we give a brief overview of the SDSS imaging survey
and repeated scans of a ~290 deg² region called "stripe 82." In
§ 3 we describe methods used to select candidate variable sources
from the SDSS stripe 82 data assembled, averaged, and recalibrated
by Ivezic et al. (2007) and present tests that show the robustness
of the adopted selection criteria. In the same section we
discuss the distribution of selected variable sources in
magnitude-color-variability space. The Milky Way halo structure traced by
selected candidate RR Lyrae stars is discussed in § 4, and in § 5
we estimate the fraction of variable quasars. Implications for surveys
such as the LSST are discussed in § 6, and our main results
are summarized in § 7.

2.  OVERVIEW  OF  THE  SDSS  IMAGING 
AND  STRIPE  82  DATA 

The quality of the photometry and astrometry, as well as the
large area covered by the survey, makes the SDSS stand out among
available optical sky surveys (Sesar et al. 2006). The SDSS provides
homogeneous and deep (r < 22.5) photometry in five bandpasses
(u, g, r, i, and z; Gunn et al. 1998, 2006; Hogg et al. 2001;
Smith et al. 2002; Tucker et al. 2006) accurate to 0.02 mag (rms
scatter) for unresolved sources not limited by photon statistics
(Scranton et al. 2002; Ivezic et al. 2003) and with a zero-point
uncertainty of 0.02 mag (Ivezic et al. 2004a). The survey sky coverage
of 10,000 deg² in the northern Galactic cap and 300 deg² in
the southern Galactic cap results in photometric measurements
for well over 100 million stars and a similar number of galaxies
(Stoughton et al. 2002). The recent Data Release 5
(Adelman-McCarthy et al. 2007)^24 lists photometric data for 215 million
unique objects observed in 8000 deg² of sky as part of the "SDSS-I"
phase that ran through 2005 June. Astrometric positions are accurate
to better than 0.1" per coordinate (rms) for sources with
r < 20.5 (Pier et al. 2003), and the morphological information
from the images allows reliable star-galaxy separation to r ~ 21.5
(Lupton et al. 2002). In addition, the five-band SDSS photometry
can be used for very detailed source classification, e.g., separation
of quasars and stars (Richards et al. 2002), spectral classification
of stars to within one to two spectral subtypes (Lenz et al.
1998; Finlator et al. 2000; Hawley et al. 2002), and even remarkably
efficient color selection of horizontal-branch and RR Lyrae
stars (Yanny et al. 2000; Sirko et al. 2004; Ivezic et al. 2005) and
low-metallicity G and K giants (Helmi et al. 2003).

24 See http://www.sdss.org/dr5.

The equatorial stripe 82 region (22h24m < α_J2000.0 < 04h08m,
−1.27° < δ_J2000.0 < +1.27°, ~290 deg²) from the southern
Galactic cap (−64° < b < −20°) presents a valuable data source
for variability studies. The region was repeatedly observed (58
imaging runs from 1998 September to 2004 December, but not
all cover the entire region), and it is the largest source of multiepoch
data in the SDSS-I phase. Observations are fairly homogeneously
distributed over the 6 yr period, with one to two
observations (separated by about a week, on average) obtained
every fall. A histogram of the number of observations per star is
shown in Figure 1 in Ivezic et al. (2007). Another source for the
large number of scans is the SDSS-II Supernova Survey (J. A.
Frieman et al. 2007, in preparation). The SDSS-II Supernova
Survey scans at least once a week during 4 month long seasons,
and two out of three planned seasonal campaigns are already
completed (the SDSS-II Supernova Survey data were not used in
this work). By averaging the repeated observations of stripe 82
sources, more accurate photometry than the nominal 0.02 mag
single-scan accuracy can be achieved, as demonstrated by Ivezic
et al. (2007), who produced a catalog of recalibrated, co-added
stripe 82 observations. The catalog lists 58 million photometric
observations for 1.4 million unresolved sources that were observed
at least four times in each of the gri bands (with a median
of 10 and a maximum of 28 observations obtained over ~6 yr).
The random photometric errors for PSF (point-spread function)
magnitudes are below 0.01 mag for stars brighter than 19.5, 20.5,
20.5, 20, and 18.5 in ugriz, respectively (about twice as accurate
as individual SDSS runs), and the spatial variation of photometric
zero points is not larger than ~0.01 mag (rms). Following Ivezic
et al. (2007), we use PSF magnitudes because they go deeper at a
given signal-to-noise ratio than aperture magnitudes and have
more accurate photometric error estimates than model magnitudes.
In addition, various low-order statistics, such as rms scatter (Σ),
χ² per degree of freedom (χ²), light-curve skewness (γ), and minimum
and maximum PSF magnitude, were computed for each
ugriz band and each source. We compute χ² per degree of freedom
as


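$$\chi^2 = \frac{1}{n-1} \sum_{i=1}^{n} \frac{(x_i - \langle x \rangle)^2}{\xi_i^2} \tag{1}$$

and light-curve skewness γ as^25

$$\gamma = \frac{n^2}{(n-1)(n-2)} \frac{\mu_3}{\Sigma^3}, \tag{2}$$

$$\mu_3 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \langle x \rangle)^3, \tag{3}$$

$$\Sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \langle x \rangle)^2}, \tag{4}$$

where n is the number of detections, x_i is the magnitude, ⟨x⟩ is
the mean magnitude, and ξ_i is the photometric error.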
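
As an illustration of equations (1)-(4), the statistics for one band of one source can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code used to build the catalog: the function name `light_curve_stats` and its list-based interface are hypothetical, and ⟨x⟩ is taken here to be the unweighted mean of the magnitudes.

```python
import math


def light_curve_stats(mags, errs):
    """Return (chi2_dof, skewness, rms) for one band of one source.

    mags: list of magnitudes x_i; errs: list of photometric errors xi_i.
    Hypothetical helper; the unweighted mean is an assumption.
    """
    n = len(mags)
    if n < 3:
        raise ValueError("need at least 3 detections for the skewness")
    mean = sum(mags) / n
    # Eq. (4): rms scatter with n - 1 in the denominator
    rms = math.sqrt(sum((x - mean) ** 2 for x in mags) / (n - 1))
    # Eq. (1): chi^2 per degree of freedom
    chi2_dof = sum(((x - mean) / e) ** 2 for x, e in zip(mags, errs)) / (n - 1)
    # Eq. (3): third central moment
    mu3 = sum((x - mean) ** 3 for x in mags) / n
    # Eq. (2): small-sample skewness; defined as 0 for a flat light curve
    skew = n ** 2 * mu3 / ((n - 1) * (n - 2) * rms ** 3) if rms > 0 else 0.0
    return chi2_dof, skew, rms
```

A light curve that is symmetric about its mean gives a skewness near zero, while a source that is usually at its brightest with occasional dimming (eclipse-like behavior) gives a positive skewness in magnitudes.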

Separation of quasars and stars, as well as efficient color selection
of horizontal-branch and RR Lyrae stars, depends on
accurate u-band photometry. To ensure this, we select 748,084 unresolved
sources from the Ivezic et al. (2007) catalog with at least
four detections in the u band. Although this cut reduces the initial
sample by about a factor of 2, it does not introduce a significant
bias in the selection of candidate variable sources, as we show in
§ 3.


3.  ANALYSIS  OF  THE  STRIPE  82  CATALOG 
OF  VARIABLE  SOURCES 

In  this  section  we  describe  methods  for  selecting  candidate 
variable  sources  and  present  tests  that  show  the  robustness  of  the 
adopted  selection  criteria.  The  distribution  of  selected  variable 
sources  in  magnitude-color-variability  space  is  also  presented. 

3.1.  Methods  and  Selection  Criteria 

Due to a relatively small number of observations per source
and random sampling, we do not perform light-curve fitting but
instead use low-order statistics to select candidate variables and
study their properties. There are four parameters (median PSF
magnitude, rms scatter Σ, χ², and light-curve skewness γ) measured
in five photometric bands (u, g, r, i, and z), for a total of 20
parameters. In the analysis presented here, we utilize eight of
them, as follows:

1. Median PSF magnitudes in the ugr bands (corrected for
interstellar extinction using the map from Schlegel et al. 1998)
because the g − r versus u − g color-color diagram has the most
classification power (e.g., Smolcic et al. 2004 and references
therein).

2. Σ and χ² in the g and r bands.

3. Light-curve skewness γ(g) (the g band combines a high
signal-to-noise ratio and large variability amplitude for the majority
of variable sources).

25 We use equations from http://www.xycoon.com/skewness_small_sample_test_1.htm.

The observed rms scatter Σ includes both the intrinsic variability
σ and the mean photometric error ⟨ξ(m)⟩ as a function of
magnitude. The dependence of Σ on magnitude in the ugrz bands
is shown in Figure 1. For sources brighter than 18, 19.5, 19.5, 19,
and 17.5 mag in ugriz, respectively, the SDSS delivers 2% photometry
with little or no dependence on magnitude. We determine
⟨ξ(m)⟩ by fitting a fourth-degree polynomial to median Σ
values in 0.5 mag wide bins (here we assume that the majority of
sources are not variable). The theoretically expected ⟨ξ(m)⟩ function
(Strateva et al. 2001),

$$\langle \xi(m) \rangle = a + b\,10^{0.4m} + c\,10^{0.8m}, \tag{5}$$

provides equally good fits. We define the intrinsic variability σ
(hereafter rms scatter σ) as

$$\sigma = \left( \Sigma^2 - \langle \xi(m) \rangle^2 \right)^{1/2} \tag{6}$$

for Σ > ⟨ξ(m)⟩ and σ = 0 otherwise.
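
The procedure above (median Σ in 0.5 mag bins, a smooth error curve through those medians, and the quadrature subtraction of equation (6)) can be sketched as follows. This is only an illustration under stated assumptions: all function names are hypothetical, and a piecewise-linear interpolation between bin medians is swapped in for the fourth-degree polynomial fit to keep the example free of external libraries.

```python
import math
from statistics import median


def median_error_curve(mags, rms_values, bin_width=0.5):
    """Median rms scatter per magnitude bin, as [(bin_center, median), ...]."""
    bins = {}
    for m, s in zip(mags, rms_values):
        bins.setdefault(math.floor(m / bin_width), []).append(s)
    return [((k + 0.5) * bin_width, median(v)) for k, v in sorted(bins.items())]


def interpolate(curve, m):
    """Piecewise-linear stand-in for the polynomial fit of <xi(m)>."""
    if m <= curve[0][0]:
        return curve[0][1]
    if m >= curve[-1][0]:
        return curve[-1][1]
    for (m0, s0), (m1, s1) in zip(curve, curve[1:]):
        if m0 <= m <= m1:
            t = (m - m0) / (m1 - m0)
            return s0 + t * (s1 - s0)


def intrinsic_sigma(rms, xi):
    """Eq. (6): sigma = sqrt(rms^2 - xi^2) if rms > xi, and 0 otherwise."""
    return math.sqrt(rms ** 2 - xi ** 2) if rms > xi else 0.0
```

For example, a source with observed scatter Σ = 0.05 mag at a magnitude where the median error is 0.03 mag has σ = 0.04 mag, while any source whose scatter falls below the error curve is assigned σ = 0.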

As the first variability selection criterion, we adopt σ(g) > 0.05
and σ(r) > 0.05 mag [hereafter written as σ(g, r) > 0.05 mag].
At the bright end, this criterion is equivalent to selecting sources
with rms scatter greater than 2.5σ_0, where σ_0 = 0.02 mag is the
measurement noise. Selection cuts are applied simultaneously in
the g and r bands to reduce the number of "false positives"
(intrinsically nonvariable sources selected as candidate variable
sources due to measurement noise). About 6% of sources pass
the σ cut in each band separately, and ~3% of sources pass the cut
in both bands simultaneously. By selecting sources with σ(g, r) >
0.05 mag, we also select faint sources that have large σ due to
large photometric errors at the faint end. To only select faint sources
with statistically significant rms scatter, we apply the χ² test as
the second selection cut.

In the χ² test, the value of χ² per degree of freedom (calculated
with respect to a weighted mean magnitude and using errors computed
by the photometric pipeline) determines whether the observed
light curve is consistent with the Gaussian distribution of
errors. Large χ² values show that the rms scatter is inconsistent
with random fluctuations. Ivezic et al. (2003, 2007) used multiepoch
SDSS observations to show that the photometric error
distribution in the SDSS roughly follows a Gaussian distribution.
A comparison of χ² distributions in the g and r bands with a
reference Gaussian χ² distribution is shown in Figure 2. As is
evident, χ² distributions in both bands roughly follow the reference
Gaussian χ² distribution for χ² < 1, demonstrating that
median photometric errors are correctly determined. The discrepancy
for larger χ² is due to variable sources rather than
non-Gaussian error distributions, as we demonstrate below.
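
To make the logic of the χ² test concrete, the following sketch simulates nonvariable sources with purely Gaussian errors and counts how often their χ² per degree of freedom exceeds a given cut. It is an illustration only: the function names, the choice of 10 epochs (the median quoted above), and the 0.02 mag error are assumptions, and the simulation is not part of the original analysis.

```python
import random


def reduced_chi2(mags, errs):
    """Chi^2 per degree of freedom about the inverse-variance weighted mean."""
    n = len(mags)
    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * x for w, x in zip(weights, mags)) / sum(weights)
    return sum(((x - mean) / e) ** 2 for x, e in zip(mags, errs)) / (n - 1)


def false_positive_rate(n_sources=20000, n_obs=10, err=0.02, cut=3.0, seed=1):
    """Fraction of simulated constant sources whose reduced chi^2 exceeds cut."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sources):
        # A constant source at 20 mag observed with Gaussian noise only
        mags = [rng.gauss(20.0, err) for _ in range(n_obs)]
        if reduced_chi2(mags, [err] * n_obs) > cut:
            hits += 1
    return hits / n_sources
```

With 9 degrees of freedom, a cut at χ² per degree of freedom of 3 should reject all but roughly 0.1% of constant sources if the errors are Gaussian and correctly estimated, so an excess of sources above that value points to real variability (or to underestimated errors).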


Fig. 1. Dependence of the median rms scatter Σ in SDSS ugrz bands on magnitude (symbols). The vertical bars show the rms scatter of Σ in each bin (not the error of
the median). The dependence of Σ in the i band is similar to the r-band dependence. In each band a fourth-degree polynomial is fitted through the medians (solid line). [See
the electronic edition of the Journal for a color version of this figure.]
Fig. 2. Top panels: Cumulative distribution of χ²(g) and χ²(r) values for all sources (solid line) and a reference Gaussian χ² distribution with 9 degrees of freedom (dashed
line). Dashed lines show adopted selection cuts on χ²(g) and χ²(r) values. Middle panels: Fraction of σ(g, r) > 0.05 mag sources with χ² per degree of freedom greater
than χ² (only in the g or r band, solid line; in both the g and r bands, dashed line). Bottom panels: Fraction of σ(g, r) > 0.05 mag sources with χ²(m) > 2 (dashed line) or
χ²(m) > 3 (solid line) as a function of magnitude for the m = g, r bands, respectively. [See the electronic edition of the Journal for a color version of this figure.]
The second selection cut, χ²(g) > 3 and χ²(r) > 3 [hereafter
written as χ²(g, r) > 3], selects ~90% of σ(g, r) > 0.05 mag
sources, as shown in Figure 2 (middle panels). The effectiveness
of the χ² test is demonstrated in Figure 2 (bottom panels). For
magnitudes fainter than g = 20.5, the fraction of candidate variables
decreases as photometric errors increase. The selection is
relatively uniform for sources brighter than g = 20.5, and we
adopt this value as the flux limit for the selected variable sample.

There are 662,195 sources brighter than g = 20.5 in the full
sample. Using σ(g, r) > 0.05 mag and χ²(g, r) > 3 as the selection
criteria, we select 13,051 candidate variable sources.^26
Therefore, at least 2% of unresolved optical sources brighter
than g = 20.5 appear variable at the >0.05 mag level (rms)
simultaneously in the g and r bands (at high Galactic latitudes).
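
The two selection cuts and the resulting variable fraction can be expressed compactly. The sketch below is illustrative only: the dictionary keys (`g`, `sigma_g`, `sigma_r`, `chi2_g`, `chi2_r`) and function names are hypothetical and do not correspond to column names in the public catalog.

```python
def is_candidate_variable(src, sigma_cut=0.05, chi2_cut=3.0):
    """Apply sigma(g, r) > 0.05 mag and chi2(g, r) > 3 to one source."""
    return (src["sigma_g"] > sigma_cut and src["sigma_r"] > sigma_cut
            and src["chi2_g"] > chi2_cut and src["chi2_r"] > chi2_cut)


def variable_fraction(sources, g_limit=20.5):
    """Return (n_selected, n_bright, fraction) for sources with g < g_limit."""
    bright = [s for s in sources if s["g"] < g_limit]
    selected = [s for s in bright if is_candidate_variable(s)]
    fraction = len(selected) / len(bright) if bright else 0.0
    return len(selected), len(bright), fraction


# Arithmetic check with the counts quoted in the text
quoted_fraction = 13051 / 662195  # about 0.0197, i.e., ~2%
```

Requiring both bands to pass both cuts is what suppresses false positives: a noise spike in a single band is not enough to select a source.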

The required minimum number of u-band detections (four)
imposed on the initial sample rejects 972 sources (or ~7% of the
variable sample). The majority of these sources are redder than
g − r ~ 1 and are very faint in the u band (u ~ 22). Our results
and conclusions presented below do not change if these sources
are included.

We also investigate how the fraction of selected variable sources
changes as a function of the minimum required number of observations
by requiring a minimum of eight detections in the ugr bands.
The fraction of selected variable sources remains at 2%. We conclude
that the fraction of selected variable sources does not depend
strongly on the minimum required number of observations.

The fraction of selected variable sources depends on the stellar
density because the number of stars increases at lower Galactic
latitudes (see Fig. 5 in Ivezic et al. 2007), while the quasar counts
remain the same. The detailed makeup of the variable stellar population
presumably depends on Galactic coordinates due to a varying
stellar population mix.

3.2.  The  Counts  of  Variable  Sources 

In this section we estimate the completeness and efficiency of
the candidate variable sample and discuss the dependence of counts,
rms scatter, σ(g)/σ(r) ratio, and light-curve skewness γ(g) on position
in the g − r versus u − g color-color diagram.

3.2.1.  Completeness 

The selection completeness, defined as the fraction of true variable
sources recovered by the algorithm, depends on the
light-curve shape and amplitudes. Due to a fairly large number of
observations (median of 10) and small σ(g, r) cutoff compared to
typical amplitudes of variable sources (e.g., most RR Lyrae stars
and quasars have peak-to-peak amplitudes of ~1 mag), we expect
the completeness to be fairly high for RR Lyrae stars (>95%;
see § 4) and quasars (>90%; see § 5). The completeness for other
types of variable sources, such as flares and eclipsing binaries, is
hard to estimate but is probably low due to sparse sampling.

3.2.2.  Efficiency 

The selection efficiency, defined as the fraction of true variable
sources in the candidate variable sample, determines the robustness
of the selection algorithm. The main diagnostic for the robustness
of the adopted selection criteria is the distribution of
selected candidates in the SDSS color-magnitude and color-color
diagrams. The position of a source in these diagrams is a good
proxy for its spectral classification (Lenz et al. 1998; Fan 1999;
Finlator et al. 2000; Smolcic et al. 2004).
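
The definitions of completeness and efficiency used here reduce to two ratios over sets of source identifiers. The sketch below assumes that a list of true variables is available (for example, from a spectroscopic or previously classified sample); the function name and the use of integer identifiers are hypothetical.

```python
def completeness_and_efficiency(selected_ids, true_variable_ids):
    """Return (completeness, efficiency) of a candidate variable sample.

    completeness = recovered true variables / all true variables
    efficiency   = recovered true variables / all selected candidates
    """
    selected = set(selected_ids)
    truth = set(true_variable_ids)
    recovered = selected & truth
    completeness = len(recovered) / len(truth) if truth else float("nan")
    efficiency = len(recovered) / len(selected) if selected else float("nan")
    return completeness, efficiency
```

For instance, if four candidates are selected and three of them are among five known variables, the completeness is 3/5 and the efficiency is 3/4.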


26 This list of candidate variable sources is publicly available from
http://www.sdss.org/dr5/products/value_added/index.html.


Figure 3 compares the distribution of candidate variable sources
to that of all sources in the g − r versus u − g color-color diagram.
Were the selection a random process, the selected candidates
would have the same distribution as the full sample. The distributions
of candidate variables and of the full sample are remarkably
different, demonstrating that the candidate variables are not
randomly selected from the parent sample, and therefore suggesting
high selection efficiency.

3.2.3.  Dominant  Classes  of  Variable  Objects 

The three dominant classes of variable objects are quasars, RR
Lyrae stars, and stars from the main stellar locus. The most obvious
difference between the variable and full sample distributions
is a much higher fraction of low-redshift quasars (z < 2.2,
recognized by their UV excess, u − g < 0.7; see Richards et al.
2002) and RR Lyrae stars (u − g ~ 1.15, g − r < 0.3; see Ivezic
et al. 2005) in the variable sample, vividly shown in Figure 3
(bottom).

Another interesting feature visible in Figure 3 (bottom) is a
gradient in the fraction of variable main stellar locus stars
(perpendicular to the main stellar locus). We investigate this gradient
by first defining principal colors,

$$P_1 = 0.91u - 0.495g - 0.415r - 1.28 \tag{7}$$

and

$$s = -0.249u + 0.794g - 0.555r + 0.234, \tag{8}$$

where P_1 and s are the principal axes parallel and perpendicular
to the main stellar locus, respectively (Ivezic et al. 2004a). The
s color is a measure of metallicity (Lenz et al. 1998), and s >
0.05 stars are expected to be metal-poor (Helmi et al. 2003).
Sources with r < 19 and 0 < P_1 < 0.9 are selected and binned
in four s bins. For each bin we calculate the fraction of sources
with σ(g) > 0.05 mag, the fraction of variable sources [selected
with σ(g, r) > 0.05 mag and χ²(g, r) > 3], the median σ(g), and
the total number of sources in the bin (see Table 1). A greater
fraction of variable sources in the last bin (s > 0.06) indicates
that, on average, metal-poor main stellar locus stars are more
variable than metal-rich stars. We speculate that this increased
variability could be because this sample of metal-poor stars is
expected to have a high fraction of giants.
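
The principal-color selection and binning described above can be sketched as follows. The coefficients are those of equations (7) and (8) and the bin edges are those of Table 1; everything else (function names, dictionary keys, the `is_variable` flag) is hypothetical, and the magnitudes are assumed to be already corrected for extinction.

```python
import math


def principal_colors(u, g, r):
    """Principal colors P1 and s of eqs. (7) and (8)."""
    p1 = 0.91 * u - 0.495 * g - 0.415 * r - 1.28
    s = -0.249 * u + 0.794 * g - 0.555 * r + 0.234
    return p1, s


# Bin edges in s, following the four bins of Table 1
S_EDGES = [-math.inf, -0.02, 0.02, 0.06, math.inf]


def variable_fraction_by_s(sources):
    """Return [(n_sources, fraction_variable), ...] for the four s bins.

    Each source is a dict with keys u, g, r and is_variable (bool).
    Only sources with r < 19 and 0 < P1 < 0.9 are counted.
    """
    totals = [0] * 4
    variables = [0] * 4
    for src in sources:
        p1, s = principal_colors(src["u"], src["g"], src["r"])
        if not (src["r"] < 19 and 0 < p1 < 0.9):
            continue
        for k in range(4):
            if S_EDGES[k] <= s < S_EDGES[k + 1]:
                totals[k] += 1
                variables[k] += bool(src["is_variable"])
                break
    return [(n, v / n if n else 0.0) for n, v in zip(totals, variables)]
```

As a check on the geometry, a star with u = 20.0, g = 18.5, and r = 17.9 has P_1 = 0.334 and s = 0.0085, so it passes the selection and falls in the second bin (−0.02 < s < 0.02).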

In order to quantify the differences between the full and variable
samples, we follow Sesar et al. (2006) and divide the g − r
versus u − g color-color diagram into six characteristic regions,
each dominated by a particular type of source, as shown in Figure 4.
The fractions and counts of variable sources and all sources
in each region are listed in Table 2 for g < 19, g < 20.5, and
g < 22 flux-limited samples. Notably, in the adopted g < 20.5
flux limit, the fraction of region II sources (dominated by
low-redshift quasars) in the variable sample is 63%, or ~30 times
greater than the fraction of region II sources in the full sample
(~2%). The fraction of region IV sources (which include RR Lyrae
stars) in the variable sample is also high when compared to the
full sample (~6 times higher).

As shown in Table 2, in the g = 20.5 flux-limited sample we
find that low-redshift quasars and RR Lyrae stars (i.e., regions II
and IV) make up 70% of the variable population while representing
only 3% of all sources. Quasars alone account for 63% of
the variable population. Stars from the main stellar locus represent
95% of all sources and 25% of the variable sample; about 0.5%
of the stars from the locus are variable at the >0.05 mag level.
Fig. 3. Distribution of counts for the full sample (top) and candidate variable
sample (middle), and the ratio of the two counts (bottom) in the g − r vs. u − g
color-color diagram for sources brighter than g = 20.5, binned in 0.05 mag bins.
Contours outline distributions of unbinned counts. Note the remarkable difference
between the distribution of all sources and that of the variable sample, which
demonstrates that the latter are robustly selected.


TABLE  1 

The  Fraction  of  Variable  Main  Stellar  Locus  Stars 
as  a  Function  of  the  s  Color 


Bin                   % σ(g) > 0.05^a   % Variable^b   ⟨σ(g)⟩^c   Counts^d

s < −0.02                   3.23            0.36         0.017      46836
−0.02 < s < 0.02            2.92            0.28         0.017     136910
0.02 < s < 0.06             4.61            1.18         0.019      29106
s > 0.06                   11.50            4.10         0.027       4547

a Fraction of sources with σ(g) > 0.05 mag.

b Fraction of variable sources [selected using σ(g, r) > 0.05 mag and χ²(g, r) > 3].

c Median σ(g).

d Number of sources in the bin.


3.3.  The  Properties  of  Variable  Sources 

Various light-curve properties, such as shape and amplitude,
are expected to be correlated with stellar types. In this section we
study the distribution of the rms scatter in the u and g bands and the
σ(g)/σ(r) ratio as a function of the u − g and g − r colors. To emphasize
trends, we bin sources and present median values for each bin.

The distribution of the median σ(u) and σ(g) values in the g − r
versus u − g color-color diagram is shown in Figure 5 (top panels).
The RR Lyrae stars show larger rms scatter (>0.2-0.3 mag)
in the u and g bands than the low-redshift quasars or stars from
the main stellar locus. The separation of higher amplitude RR
Lyrae type ab stars (u − g ~ 1.15, g − r > 0.15) and lower amplitude
type c stars (u − g ~ 1.15, g − r < 0.15) is easily discernible.
Quasars also show slightly larger rms scatter in the u band
(~0.1 mag) than in the g band (~0.07 mag), as discussed by
Kinney et al. (1991), Ivezic et al. (2004b), and Vanden Berk et al.
(2004). If we define the degree of variability as the rms scatter in
the g band, then, on average, RR Lyrae stars show the greatest variability,
followed by quasars and main stellar locus stars.

Another  distinctive  characteristic  of  variable  sources  is  the  ratio 
of  flux  changes  in  different  bandpasses.  This  property  can  be  used 
to  select  different  types  of  variable  sources.  For  example,  RR  Lyrae 
stars  are  bluer  when  brighter,  a  behavior  used  by  Ivezic  et  al. 
(2000)  to  select  RR  Lyrae  stars  using  two-epoch  SDSS  data.  Here 
we define a new parameter, σ(g)/σ(r), to express the ratio of flux
changes in the g and r bands, and we study its distribution in the
g − r versus u − g color-color diagram. In particular, we examine
this  distribution  and  its  median  values  for  three  dominant  classes 
of  variable  sources:  quasars,  RR  Lyrae  stars,  and  stars  from  the 
main  stellar  locus. 

Figure 5 (bottom left panel) shows the distribution of median
σ(g)/σ(r) values as a function of the u − g and g − r colors. Using
Figure 5 we note that, on average,

1. RR Lyrae stars have σ(g)/σ(r) ~ 1.4;

2. main stellar locus stars have σ(g)/σ(r) ~ 1;

3. quasars show a σ(g)/σ(r) gradient in the g − r versus u − g
color-color diagram.

The average value of σ(g)/σ(r) ~ 1.4 in region IV indicates
that RR Lyrae stars dominate the variable source count in this
region (Ivezic et al. 2000). While Figure 5 only presents the
median values of the rms scatter, Figure 6 shows the correlation
between the rms scatter in the g and r bands for individual sources.
The sources with high rms scatter in the g band also have high
rms scatter in the r band. Variability in these bands correlates with
u − g color, and for RR Lyrae stars (u − g ~ 1) has the form
σ(g) = 1.4σ(r). Figure 6 also shows how sources with large χ²
(χ² > 3) have large rms scatter in the g and r bands.
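The statistics used in this comparison can be sketched as below. The exact estimators adopted in the paper are defined in an earlier section, so the unweighted mean and the n − 1 normalization here are assumptions; the thresholds σ(g, r) > 0.05 mag and χ²(g, r) > 3 are the ones quoted in the table notes, and the function names are illustrative.

```python
import math


def rms_scatter(mags):
    """Sample rms scatter of a light curve about its (unweighted) mean."""
    n = len(mags)
    mean = sum(mags) / n
    return math.sqrt(sum((m - mean) ** 2 for m in mags) / (n - 1))


def chi2_per_dof(mags, errs):
    """Chi-square per degree of freedom relative to a constant-brightness model."""
    n = len(mags)
    mean = sum(mags) / n
    return sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs)) / (n - 1)


def is_candidate_variable(g, g_err, r, r_err, sigma_min=0.05, chi2_min=3.0):
    """Require large rms scatter and large chi-square in both the g and r bands."""
    return (
        rms_scatter(g) > sigma_min
        and rms_scatter(r) > sigma_min
        and chi2_per_dof(g, g_err) > chi2_min
        and chi2_per_dof(r, r_err) > chi2_min
    )


# Made-up four-epoch light curves with 0.02 mag errors.
g = [17.0, 17.4, 17.2, 17.6]
r = [17.0, 17.3, 17.1, 17.4]
err = [0.02] * 4
print(rms_scatter(g) / rms_scatter(r), is_candidate_variable(g, err, r, err))
```

Requiring both bands to pass is what suppresses single-band photometric outliers.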




[Fig. 4 image not reproduced; surviving axis labels: u − g, g − r, r − i; color legend: σ(g) from 0.05 to 0.2.]

Fig. 4. Distribution of 18,329 candidate variable sources brighter than g = 21 in representative SDSS color-magnitude and color-color diagrams. Candidate variables
are color-coded by their rms scatter in the g band (for 0.05-0.2, see the legend; for larger than or equal to 0.2, red). Only sources brighter than g = 20 are plotted in the
color-color diagrams. Note that RR Lyrae stars [u − g ~ 1.15, σ(g) ≳ 0.2 mag; red dots] and low-redshift quasars [u − g < 0.7, σ(g) ~ 0.1 mag; green dots] stand out as
highly variable sources. The regions marked in the top right panel are used for quantitative comparison of the overall and variable source distributions (see Table 2).


The average ratio of σ(g)/σ(r) ~ 1 (i.e., gray flux variations)
for stars in the main stellar locus suggests that variability could
be caused by eclipsing systems. The distribution of γ(g) for main
stellar locus stars further strengthens this possibility, as discussed
in § 3.4.

The gradient in the σ(g)/σ(r) ratio observed for low-redshift
quasars in the g − r versus u − g color-color diagram suggests
that the variability correlation between the g and r bands is more
complex than in the case of RR Lyrae or main stellar locus stars.
Wilhite et al. (2005) show that the photometric color changes for
quasars depend on the combined effects of continuum changes,
emission-line changes, redshift, and the selection of photometric
bandpasses. They note that due to the lack of variability of the
lines, measured photometric color is not always bluer in brighter
phases but depends on redshift and the filters used. To verify the
dependence of broadband photometric variability on redshift, we
plot σ(g)/σ(r) versus redshift for all spectroscopically confirmed
unresolved quasars from Schneider et al. (2005) that are in stripe
82, as shown in Figure 7. We confirm that the broadband
photometric variability depends on the redshift, and that the σ(g)/σ(r)
gradient in the g − r versus u − g color-color diagram can be
explained by the increase in σ(g)/σ(r) from ~1 to ~1.6 in the 1.0-1.6
redshift range. This effect is due to the Mg II emission line (more
stable in flux than the continuum) moving through the r-band
filter over this redshift range. The implied correlation of the u − g
and g − r colors with redshift is consistent with the discussion by



TABLE 2

The Distribution of Candidate Variable Sources in the g − r versus u − g Diagram

g < 19
Region (a)  Name (b)              % All (c)   % Var. (d)   Var./All (e)   Nvar/Nall (f)
I           White dwarfs               0.14         0.59           4.25            3.50
II          Low-redshift QSOs          0.45        30.88          68.83           56.58
III         dM/WD pairs                0.08         0.53           6.54            5.37
IV          RR Lyrae stars             1.28        16.81          13.11           10.78
V           Stellar locus stars       96.27        48.77           0.51            0.42
VI          High-redshift QSOs         1.78         2.42           1.36            1.12
Total count                          411667         3384

g < 20.5
Region (a)  Name (b)              % All (c)   % Var. (d)   Var./All (e)   Nvar/Nall (f)
I           White dwarfs               0.24         0.40           1.69            3.34
II          Low-redshift QSOs          1.90        62.90          33.03           65.10
III         dM/WD pairs                0.83         2.08           2.50            4.92
IV          RR Lyrae stars             1.33         7.95           5.99           11.81
V           Stellar locus stars       94.49        25.15           0.27            0.52
VI          High-redshift QSOs         1.21         1.52           1.26            2.48
Total count                          662195        13051

g < 22
Region (a)  Name (b)              % All (c)   % Var. (d)   Var./All (e)   Nvar/Nall (f)
I           White dwarfs               0.28         0.45           1.64            4.51
II          Low-redshift QSOs          4.07        70.01          17.22           47.30
III         dM/WD pairs                1.21         3.79           3.13            8.61
IV          RR Lyrae stars             1.48         6.41           4.33           11.90
V           Stellar locus stars       91.89        18.33           0.20            0.55
VI          High-redshift QSOs         1.07         1.01           0.95            2.60
Total count                          748067        20553

(a) These regions are defined in the g − r vs. u − g color-color diagram, with their boundaries shown in Fig. 4.
(b) An approximate description of the dominant source type found in the region.
(c) The fraction of all sources in a magnitude-limited sample found in this color region, with the magnitude limits listed on top.
(d) The number of candidate variables from the region, expressed as a fraction of all variable sources.
(e) The ratio of values listed in the two columns immediately preceding.
(f) The number of candidate variables from the region, expressed as a fraction of all sources in that region.


Richards et al. (2002). The lack of noticeable correlation of σ(g)
with redshift is due to the combined effects of the dependence of
σ(g) on the rest-frame wavelength and time, which cancel out (for
a detailed model, see Ivezic et al. 2004b).
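As a rough check of the Mg II explanation given earlier in this section, one can compute where the line falls in the observed frame as a function of redshift. The rest wavelength (about 2798 Å) and the r-band extent (roughly 5500-7000 Å) used below are approximate values assumed for illustration, not numbers taken from this paper.

```python
MG_II_REST_ANGSTROM = 2798.0              # approximate Mg II rest wavelength (assumed)
R_BAND_RANGE_ANGSTROM = (5500.0, 7000.0)  # rough extent of the r band (assumed)


def observed_wavelength(rest_wavelength, redshift):
    """Observed-frame wavelength of a line emitted at rest_wavelength."""
    return rest_wavelength * (1.0 + redshift)


def line_in_band(rest_wavelength, redshift, band=R_BAND_RANGE_ANGSTROM):
    """True if the redshifted line falls inside the given wavelength band."""
    low, high = band
    return low <= observed_wavelength(rest_wavelength, redshift) <= high


for z in (0.8, 1.0, 1.3, 1.6):
    wavelength = observed_wavelength(MG_II_REST_ANGSTROM, z)
    print(z, round(wavelength), line_in_band(MG_II_REST_ANGSTROM, z))
```

With these assumed numbers the line enters the band near a redshift of 1.0 and leaves it near 1.5, broadly in line with the redshift range over which σ(g)/σ(r) rises.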

3.4. Skewness as a Proxy for the Dominant
Variability Mechanism

Light-curve skewness, a measure of the light-curve asymmetry,
provides additional information on the type of variability.
Negatively skewed, asymmetric light curves indicate variable
sources that spend more time fainter than (m_min + m_max)/2, where m_min
and m_max are magnitudes at the minimum and maximum. Type
ab RR Lyrae stars, for example, have negatively skewed light
curves (γ ~ −0.5; Wils et al. 2006). Positively skewed,
asymmetric light curves indicate variable sources that spend more time
brighter than (m_min + m_max)/2 (e.g., eclipsing systems). Sources
with symmetric light curves will have γ ~ 0.
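The sign convention can be illustrated with a small sketch. It assumes the simple moment-based (population) skewness estimator, which may differ from the estimator used for the catalog; the light curves are made-up magnitudes, where larger numbers mean fainter.

```python
def skewness(mags):
    """Moment-based skewness of the magnitudes (assumed estimator)."""
    n = len(mags)
    mean = sum(mags) / n
    m2 = sum((m - mean) ** 2 for m in mags) / n
    m3 = sum((m - mean) ** 3 for m in mags) / n
    return m3 / m2 ** 1.5


# Mostly at the bright level with one faint dip: eclipse-like behavior.
eclipsing = [15.0] * 9 + [15.6]
# Mostly near the faint level with one bright excursion: RRab-like behavior.
pulsating = [15.6] * 9 + [15.0]
# Evenly spread magnitudes: symmetric behavior.
symmetric = [15.0, 15.1, 15.2, 15.3, 15.4]

for name, curve in (("eclipsing", eclipsing), ("pulsating", pulsating), ("symmetric", symmetric)):
    print(name, skewness(curve))
```

The eclipse-like curve yields a positive value and the RRab-like curve a negative one, matching the convention described above.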

Figure 5 (bottom right panel) shows the distribution of the
median γ(g) as a function of position in the g − r versus u − g
color-color diagram. On average, quasars and c-type RR Lyrae
stars (u − g ~ 1.15, g − r < 0.15) have γ(g) ~ 0, ab-type RR
Lyrae stars (u − g ~ 1.15, g − r > 0.15) have negative
skewness [γ(g) ~ −0.5], and stars in the main stellar locus have
positive skewness.

Figure 8 shows the distribution of the light-curve skewness in
the ugi bands for spectroscopically confirmed unresolved quasars
from Schneider et al. (2005), which are in stripe 82; candidate
RR Lyrae stars (selection details are discussed in § 4); and main
stellar locus stars from our variable sample. Stars in the main
stellar locus show a bimodal γ(g) distribution. This distribution
suggests at least two, and perhaps more, different populations of
variables. Indeed, when spectroscopically confirmed M dwarfs are
selected, a third peak appears at γ(g) ~ 2.5, possibly associated
with flaring M dwarfs (A. Kowalski et al. 2007, in preparation).
A bimodality similar to that in the g band is also discernible in the
r band, while it is less pronounced in the i band and not detected
in the u and z bands (the r and z data are not shown). The
bimodality in the u and z bands is not detected due to high
photometric errors in these bands at faint magnitudes. The measurement
uncertainty decreases the asymmetry of the light curve, and γ
approaches 0. The less pronounced bimodality in the i band might
be due to astrophysical reasons, since the photometric errors in the
i band are comparable to g- and r-band photometric errors.


A comparison of the u − g and g − r color distributions for
variable main stellar locus stars brighter than g = 19 and a subset
with highly asymmetric light curves [γ(g) > 2.5] is shown in
Figure 9. The subset with asymmetric light curves has an increased
fraction of stars with colors u − g ~ 2.5 and g − r ~ 1.4 that
correspond to M stars. This may indicate that M stars have a higher
probability of being associated with an eclipsing companion than
stars with earlier spectral types. However, the selection effects are
probably important, since a companion is easier to detect (due to
the low luminosity of M dwarfs). A. Kowalski et al. (2007, in
preparation) examine these issues using light-curve data on a sample of
spectroscopically confirmed M dwarfs. Finally, quasars have
symmetric light curves (γ ~ 0) and their distribution of skewness does
not change between bands.

While the value of the skewness for individual sources may
strongly depend on the number of available epochs, the median
binned values of skewness are not very sensitive to the number of
available epochs. We verify this claim by repeating the analysis
presented above on a subsample of sources with at least eight
detections in the ugr bands. A more conservative cut results in a
smaller sample, but general trends and results remain the same.

4.  THE  MILKY  WAY  HALO  STRUCTURE  TRACED 
BY  CANDIDATE  RR  LYRAE  STARS 

Studies of substructures in the Galactic halo, such as clumps
and streams, can constrain the formation history of the Milky Way.
Among the best tracers for studying the outer halo are RR Lyrae
stars, because they are nearly standard candles, are sufficiently
bright to be detected at large distances (5-100 kpc for 14 <
r < 20.7), and are sufficiently numerous to trace the halo
substructure with a high spatial resolution. The General Catalog of
Variable Stars (GCVS; Kholopov et al. 1988; see footnote 27) lists RR Lyrae
stars as RR Lyrae type ab (RRab) and type c (RRc) stars. RRab
stars have asymmetric light curves, periods from 0.3 to 1.2 days,
and amplitudes from V ~ 0.5 to 2. RRc stars have nearly
symmetric, sometimes sinusoidal, light curves with periods from 0.2
to 0.5 days and amplitudes not greater than V ~ 0.8. In this work
we assume M_V = 0.7 as the absolute V-band magnitude of RRab
and RRc stars. A comprehensive review of RR Lyrae stars can be
found in Smith (1995).
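The quoted distance range follows from the distance modulus, m − M = 5 log10(d / 10 pc). A minimal sketch, assuming no extinction correction and applying the adopted absolute magnitude of 0.7 directly to the r-band magnitude (the text quotes M_V, so this is an approximation; the function name is illustrative):

```python
def distance_kpc(apparent_mag, absolute_mag=0.7):
    """Distance in kpc from the distance modulus, ignoring extinction."""
    distance_pc = 10 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)
    return distance_pc / 1000.0


# Bright and faint ends of the quoted magnitude range.
for r_mag in (14.0, 20.7):
    print(r_mag, distance_kpc(r_mag))
```

The two limits come out at roughly 5 and 100 kpc, consistent with the range quoted in the text.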


27 A list of GCVS variability types can be found at
http://www.sai.msu.su/groups/cluster/gcvs/gcvs/iii/vartype.txt.

[Fig. 5 image not reproduced; top panel titles: u < 20.5, median σ(u) (linear scale: 0.05 to 0.30); g < 20.5, median σ(g) (linear scale: 0.05 to 0.30).]

Fig. 5. Distribution of the rms scatter σ(u) (top left), rms scatter σ(g) (top right), σ(g)/σ(r) ratio (bottom left), and γ(g) (bottom right) for the variable sample in the
g − r vs. u − g color-color diagram. Sources are binned in 0.05 mag wide bins, and the median values are color-coded. Color ranges are given at the top of each panel, from
blue to red, where green is in the midrange. Values outside the range saturate in blue or red. Contours outline the count distributions on a linear scale in steps of 15%. The
flux limit is g < 20.5, with an additional u < 20.5 limit in the top left panel. Top panels: Note how higher amplitude RRab (u − g ~ 1.15, g − r > 0.15) and lower
amplitude RRc stars (u − g ~ 1.15, g − r < 0.15) separate in these plots. Bottom left: On average, RR Lyrae stars have σ(g)/σ(r) ~ 1.4, main stellar locus stars have
σ(g)/σ(r) ~ 1, and low-redshift quasars show a gradient of σ(g)/σ(r) values. Bottom right: On average, quasars and RRc stars have γ(g) ~ 0, RRab stars have negative
skewness, and stars in the main stellar locus have positive skewness.


In  this  section  we  fine-tune  criteria  for  selecting  candidate  RR 
Lyrae  stars  and  estimate  the  selection  completeness  and  efficiency. 
Using  selected  candidate  RR  Lyrae  stars,  we  recover  a  known  halo 
clump  associated  with  the  Sgr  dwarf  tidal  stream  and  find  several 
new  halo  substructures. 

4.1. Criteria for Selecting RR Lyrae Stars

Figures 3-5 show that RR Lyrae stars occupy a well-defined
region (region IV) in the g − r versus u − g color-color diagram,
and Figure 6 shows how RR Lyrae stars follow the σ(g) =
1.4σ(r) relation. Motivated by these results, we introduce color
and σ(g)/σ(r) cuts to specifically select candidate RR Lyrae stars
from the variable sample and study their distribution in the rms
scatter-color-light-curve skewness parameter space.

RR Lyrae stars have distinctive colors and can be selected
with the following criteria (Ivezic et al. 2005):

$$0.98 < u - g < 1.30, \tag{9}$$

$$-0.05 < D_{ug} < 0.35, \tag{10}$$

$$0.06 < D_{gr} < 0.55, \tag{11}$$

$$-0.15 < r - i < 0.22, \tag{12}$$

$$-0.21 < i - z < 0.25, \tag{13}$$

Fig. 6. Distribution of candidate variable sources in the g < 20.5
flux-limited sample. This is shown by linearly spaced contours and by symbols
color-coded with the u − g color for sources with σ(g) > 0.05 and σ(r) > 0.05 mag. The
dotted lines show the adopted σ(g, r) selection cut. The solid line shows σ(g) =
σ(r), while the dashed line shows the σ(g) = 1.4σ(r) relation representative of
RR Lyrae stars. Note that sources following the σ(g) = 1.4σ(r) relation tend to
have u − g ~ 1, as expected for RR Lyrae stars. The gray-scale background
shows the fraction of χ²(g, r) > 3 sources that also have σ(g) > x and σ(r) > y and
demonstrates that large-χ² sources also have large σ.

where 
$$D_{ug} = (u - g) + 0.67(g - r) - 1.07 \tag{14}$$

Fig. 7. Dependence of σ(g)/σ(r) (top), g − r (middle), and σ(g) (bottom) on
redshift for a sample of spectroscopically confirmed unresolved quasars from
Schneider et al. (2005). The σ(g)/σ(r) gradient shown in Fig. 5 (bottom left panel)
can be explained by the local maximum of σ(g)/σ(r) in the 1.0-1.6 redshift range.


$$D_{gr} = 0.45(u - g) - (g - r) - 0.12. \tag{15}$$
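The color criteria of equations (9)-(15) translate directly into a boolean test. The sketch below is only an illustration: the function name and the choice of passing five time-averaged magnitudes as arguments are assumptions.

```python
def passes_rr_lyrae_color_cuts(u, g, r, i, z):
    """Apply the color cuts of eqs. (9)-(15) to time-averaged ugriz magnitudes."""
    ug, gr, ri, iz = u - g, g - r, r - i, i - z
    d_ug = ug + 0.67 * gr - 1.07   # eq. (14)
    d_gr = 0.45 * ug - gr - 0.12   # eq. (15)
    return (
        0.98 < ug < 1.30           # eq. (9)
        and -0.05 < d_ug < 0.35    # eq. (10)
        and 0.06 < d_gr < 0.55     # eq. (11)
        and -0.15 < ri < 0.22      # eq. (12)
        and -0.21 < iz < 0.25      # eq. (13)
    )


# A source with u - g = 1.15 and g - r = 0.20 passes; a red star with u - g = 2.5 does not.
print(passes_rr_lyrae_color_cuts(u=18.15, g=17.00, r=16.80, i=16.75, z=16.75))
print(passes_rr_lyrae_color_cuts(u=19.50, g=17.00, r=16.80, i=16.75, z=16.75))
```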

We apply these cuts to our sample of candidate variables and
select 846 sources. It is implied by Ivezic et al. (2005) that RR
Lyrae stars should always stay within these color boundaries, even
though their colors change as a function of phase. Their
distribution in the g − r versus u − g color-color diagram and rms scatter
in the g band are shown in Figure 10 (top left panel). The
distribution of sources in the RR Lyrae region is inhomogeneous.
Sources with large rms scatter in the g band (≳0.2 mag) are
centered around u − g ~ 1.15 and are separated by g − r ~ 0.12 into
two groups. A comparison with Figure 3 from Ivezic et al. (2005)
suggests that these large rms scatter sources might be RRab
(g − r > 0.12) and RRc (g − r < 0.12) stars. Small rms scatter
sources (~0.1 mag) have a fairly uniform distribution and are
slightly bluer, with u − g ~ 1.1.

The distribution of sources from the RR Lyrae region in the σ(r)
versus σ(g) diagram is presented in Figure 10 (top right panel).
The majority of large rms scatter sources follow the σ(g) =
1.4σ(r) relation, as expected for RR Lyrae stars. Since RR Lyrae
stars are bluer when brighter or, equivalently, have greater rms
scatter in the g band than in the r band, we require 1 < σ(g)/σ(r) <
2.5 and select 683 candidate RR Lyrae stars.


A comparison of u − g color distributions for candidate RR
Lyrae stars and for sources with RR Lyrae colors but not tagged as
RR Lyrae stars, presented in Figure 10 (bottom left panel),
demonstrates the robustness of the RR Lyrae selection. The two
distributions are very different (the probability that they are the same
is 10⁻⁴, as given by the Kolmogorov-Smirnov test), with the
candidate RR Lyrae distribution peaking at u − g ~ 1.15, as expected
for RR Lyrae stars.
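A two-sample Kolmogorov-Smirnov comparison of this kind can be sketched in plain Python. The implementation below is an assumption about how such a test might be run, not the authors' procedure: it computes the maximum distance between the two empirical cumulative distributions and converts it to a probability with the standard asymptotic series, which is only approximate for small samples.

```python
import math


def ks_two_sample(a, b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test (asymptotic p)."""
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        # Step both empirical distribution functions past x (handles ties).
        while i < n1 and a[i] <= x:
            i += 1
        while j < n2 and b[j] <= x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The series converges slowly here; the probability is essentially 1.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


# Made-up u - g colors: one sample centered near 1.12, the other near 1.02.
candidates = [1.10 + 0.001 * k for k in range(50)]
others = [1.00 + 0.001 * k for k in range(50)]
print(ks_two_sample(candidates, others))
```

A small probability indicates that the two color distributions are unlikely to be drawn from the same parent distribution.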

One property that distinguishes RRab from RRc stars is the
shape (or skewness) of their light curves (in addition to light-curve
amplitude and period). RRab stars have asymmetric light curves,
while RRc light curves are symmetric. In Figure 10 (top left panel)
we noted that g − r ~ 0.12 seemingly separates high rms scatter
sources into two groups. If g − r ~ 0.12 is the boundary between
the RRab and RRc stars, then the same boundary should show up
in the distribution of light-curve skewness as a function of g − r
color. As shown in Figure 10 (bottom right panel), this is indeed
the case. On average, sources with g − r < 0.12 have γ(g) ~ 0
(symmetric light curves), as do RRc stars, while g − r > 0.12
sources have γ(g) ~ −0.5 (asymmetric light curves) typical of
RRab stars.

We show in § 4.2 that candidate RR Lyrae stars with γ(g) > 1
are contaminated by eclipsing variables. Therefore, to reduce the

Fig. 8. Light-curve skewness distribution in the ugi bands for
spectroscopically confirmed unresolved quasars (dotted line), candidate RR Lyrae stars
(dashed line), and variable main stellar locus stars (solid line, region V; see Fig. 4
for the definition). The distribution of the skewness in the r band is similar to the
g-band distribution, and the distribution of skewness in the z band is similar to the
u-band distribution (therefore, the r and z data are not shown). Stars in the main
stellar locus show bimodality in γ(g), suggesting at least two, and perhaps more,
different populations of variables. Similar bimodality is also discernible in the r
band, while it is less pronounced in the i band and not detected in the u and z bands
(due to higher photometric errors). Quasars have symmetric light curves (γ ~ 0),
and their distribution of skewness does not change between bands. [See the
electronic edition of the Journal for a color version of this figure.]


contamination by eclipsing variables, we also require γ(g) < 1 and
select 634 sources as our final sample of candidate RR Lyrae stars.
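The variability-based part of the selection can be summarized in a short sketch. It assumes the source has already passed the color cuts; the function name and the "RRab-like"/"RRc-like" labels are illustrative, and the g − r = 0.12 split is only the approximate boundary suggested in the text, not a firm classification.

```python
def classify_rr_lyrae_candidate(sigma_g, sigma_r, skew_g, g_r):
    """Return 'RRab-like', 'RRc-like', or None for a color-selected variable.

    sigma_g, sigma_r: rms scatter in the g and r bands (mag)
    skew_g: light-curve skewness in the g band
    g_r: time-averaged g - r color
    """
    ratio = sigma_g / sigma_r
    if not 1.0 < ratio < 2.5:   # bluer-when-brighter requirement
        return None
    if not skew_g < 1.0:        # reject likely eclipsing variables
        return None
    return "RRab-like" if g_r > 0.12 else "RRc-like"


print(classify_rr_lyrae_candidate(0.28, 0.20, -0.5, 0.20))
print(classify_rr_lyrae_candidate(0.14, 0.10, 0.0, 0.05))
print(classify_rr_lyrae_candidate(0.28, 0.20, 1.5, 0.20))
```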

4.2.  Completeness  and  Efficiency 

The selection completeness, defined as the fraction of
recovered RR Lyrae stars, will depend on the color cuts, the σ(g, r)
cutoff, and the number of observations. The color cuts (eqs. [9]-[15])
applied in § 4.1 were chosen to minimize contamination by
sources other than RR Lyrae stars while maintaining an almost
100% completeness (Ivezic et al. 2005). With the σ(g, r) cutoff at
0.05 mag (small compared to the ~1 mag typical peak-to-peak
amplitudes of RR Lyrae stars) and a fairly large number of
observations per source (median of 10), we estimate the RR Lyrae
selection completeness to be ≳95% (see the Appendix in Ivezic
et al. 2000).

To determine the selection efficiency, defined as the fraction of
true RR Lyrae stars in the RR Lyrae candidate sample, we
positionally match 683 candidate RR Lyrae stars selected by 1 <
σ(g)/σ(r) < 2.5 to a sample of RR Lyrae sources selected from the
SDSS Light-Motion-Curve Catalog (LMCC; D. Bramich et al.
2007, in preparation). This catalog covers a slightly smaller region
of the sky (20h42m < α_J2000.0 < 03h16m, −1.26° < δ_J2000.0 <
+1.26°) than that discussed here but includes more densely
sampled SDSS-II observations that allow the construction of light
curves. We match 613 candidates, while 70 candidate RR Lyrae
stars from our sample are not in the LMCC footprint. Following
the classification based on phased light curves by N. De Lee et al.
(2007, in preparation) we find that 71% of 1 < σ(g)/σ(r) < 2.5,
γ(g) < 1 sources in our candidate RR Lyrae sample are classified
as RRab and RRc, 28% are classified as variable non-RR Lyrae
stars, and only 1% of sources in this sample are classified as
spurious, nonvariable sources. While we do not know exactly the
completeness of the N. De Lee et al. (2007, in preparation)
samples (efficiency is about 100%, as verified by the visual
inspection of light curves), we speculate that it is likely very high,
given more densely sampled observations available in the SDSS-II
Supernova Survey. The most significant contamination in our
candidate RR Lyrae sample comes from a population of variable
sources bluer than u − g ~ 1.1 (Fig. 11, dotted line in bottom left
panel), possibly Population II δ Scuti stars, also known as SX
Phoenicis stars (Hoffmeister et al. 1985).

The top left and the bottom right panels in Figure 11 show that
RRab- and RRc-dominated regions are separated by g − r ~
0.12, as already hinted in Figure 10. Also, variable non-RR Lyrae
sources with γ(g) > 1 are classified by N. De Lee et al. (2007, in
preparation) as eclipsing variables, justifying our γ(g) < 1 cut.

To summarize, using color criteria and criteria based on σ(g),
σ(r), and γ(g), RR Lyrae stars are selected with ≳95%
completeness and ~70% efficiency.
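The efficiency estimate amounts to counting classification labels among the matched candidates. A minimal sketch with made-up labels (the label strings and function names are illustrative, and the toy list is not the actual sample):

```python
from collections import Counter


def class_fractions(labels):
    """Fraction of matched candidates in each classification."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def efficiency(labels, true_classes=("RRab", "RRc")):
    """Fraction of candidates classified as genuine RR Lyrae stars."""
    fractions = class_fractions(labels)
    return sum(fractions.get(name, 0.0) for name in true_classes)


# Toy sample of ten classified candidates.
labels = ["RRab"] * 5 + ["RRc"] * 2 + ["variable non-RR Lyrae"] * 3
print(class_fractions(labels))
print(efficiency(labels))
```

Only candidates inside the comparison catalog's footprint can enter such a count, which is why the 70 unmatched candidates are excluded from the estimate.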

4.3.  The  Spatial  Distribution  of  Candidate  RR  Lyrae  Stars 

Using the selection criteria from § 4.1 we isolate 634 RR Lyrae
candidates. The magnitude-position diagram for these candidates
within 2.5° of the celestial equator is shown in Figure 12.

As discussed by Ivezic et al. (2005), an advantage of the data
representation utilized in Figure 12 (magnitude-right ascension
diagram) is its simplicity; only “raw” data are shown, without
any postprocessing. However, the magnitude scale is logarithmic,
and thus the spatial extent of structures is heavily distorted. In order
to avoid these shortcomings, we have applied a Bayesian method
for estimating continuous spatial density distribution developed
by Ivezic et al. (2005, their Appendix B). The resulting density map
is shown in Figure 13 (right). The advantage of that representation

Fig. 9. Comparison of the u − g (left) and g − r (right) color distributions for variable main stellar locus stars brighter than g = 19 (dashed lines) and a subset with
highly asymmetric light curves [γ(g) > 2.5; solid lines]. The subset with highly asymmetric light curves has an increased fraction of stars with colors u − g ~ 2.5 and
g − r ~ 1.5, characteristic of M stars. [See the electronic edition of the Journal for a color version of this figure.]


is that it better conveys the significance of various local
overdensities. For comparison, we also show a map of the northern part
of the equatorial strip constructed using two-epoch data discussed
by Ivezic et al. (2000).

We detect several new halo substructures at >3σ significance
(compared to expected Poissonian fluctuations) and present their
approximate locations and properties in Table 3. In order to
assess the statistical significance of the clumps seen in Figure 13, we
have performed a detailed analysis using random samples of the
same size as the candidate RR Lyrae sample. The samples were
constructed using flat distributions in magnitude and right
ascension, which approximately correspond to a 1/R³ RR Lyrae volume

[Fig. 10 image not reproduced; surviving axis labels: u − g, g − r.]

Fig. 10. Top left: Distribution of 846 candidate variable sources from the RR Lyrae region (dashed lines; see Fig. 3 in Ivezic et al. 2005) in the g − r vs. u − g
color-color diagram. The symbols mark the time-averaged values and are color-coded by σ(g) (0.05-0.2; blue to red). The dotted line shows the boundary between the RRab- and
RRc-dominated regions. Top right: Sources from the top left panel divided into three groups according to their σ(g)/σ(r) values: candidate RR Lyrae stars with
1 < σ(g)/σ(r) < 2.5 (large dots), sources with σ(g)/σ(r) < 1 (triangles), and sources with σ(g)/σ(r) > 2.5 (squares). Small dots show sources with RR Lyrae colors
that fail the variability criteria. The dashed lines show the σ(g) = σ(r) and σ(g) = 2.5σ(r) relations, while the dotted line shows the σ(g) = 1.4σ(r) relation. Bottom left:
Comparison of the u − g color distributions for candidate RR Lyrae stars (solid line) and sources with RR Lyrae colors but not tagged as RR Lyrae stars (dashed line).
Bottom right: Dependence of γ(g) on the g − r color for candidate RR Lyrae stars. The boundary g − r = 0.12 (dotted line) separates candidate RR Lyrae stars into those
with asymmetric [γ(g) ~ −0.5] and symmetric [γ(g) ~ 0] light curves, corresponding to RRab and RRc stars, respectively. The condition γ(g) < 1 (dashed line) is used
to reduce the contamination of the RR Lyrae sample by eclipsing variables.

Fig. 11. Distribution of candidate RR Lyrae stars selected with 1 < σ(g)/σ(r) < 2.5 and classified by N. De Lee et al. (2007, in preparation) shown in diagrams similar
to Fig. 10. Symbols show RRab stars (red dots), RRc stars (blue dots), variable non-RR Lyrae stars (green dots), and nonvariable sources (open squares; only four sources).
A comparison of the u − g color distribution for RRab (solid line), RRc (dashed line), and variable non-RR Lyrae stars (dotted line) is shown in the bottom left panel.


density distribution, where R is the distance from the Galactic
center. We find that the contamination by non-RR Lyrae stars is not
an issue because this effect tends to minimize the structure, rather
than enhance it. The random samples often have clumplike
features due to Poisson fluctuations. By comparing the histograms of
density distributions for random samples and the observed
candidate RR Lyrae sample, we conclude that clumps I and K are
consistent with random fluctuations, and one or two of clumps D, F,

[Fig. 12 image not reproduced; surviving axis label: R.A. (deg).]


Fig. 12. Magnitude-position distribution of 634 stripe 82 RR Lyrae
candidates within −55° < R.A. < 60° and |decl.| < 1.21°. Approximate distance (shown
on the right y-axis) is calculated assuming M_r = 0.7 mag for RR Lyrae stars.
Dashed lines show where sample completeness decreases from approximately
99% to 60% due to the χ² cut (see Fig. 2, bottom right panel). Closed curves are
remapped ellipses and circles from Fig. 13 that mark halo substructure. [See the
electronic edition of the Journal for a color version of this figure.]


H, and L may also be caused by such fluctuations. The remaining
seven clumps (A, B, C, E, G, J, and M) are inconsistent with
random fluctuations and likely represent real halo substructure. Stars
associated with these clumps account for about 50% of the
sample. This estimate for the fraction of halo stars that are associated
with substructure is in good agreement with estimates based on
main-sequence stars in the inner halo (out to about 30 kpc) by Juric
et al. (2007) and Bell et al. (2007).
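The idea behind the random-sample comparison can be sketched with a simple Monte Carlo test. This is not the Bayesian density estimate used for the actual analysis: it swaps in plain counts in a coarse (right ascension, magnitude) grid, and the grid size, number of trials, and function names are assumptions. The flat magnitude distribution mirrors the 1/R³ profile because counts per unit magnitude scale as density times distance cubed, which is constant for that profile when heliocentric and Galactocentric distances are treated as interchangeable.

```python
import random


def max_cell_count(points, ra_range, mag_range, n_ra=12, n_mag=7):
    """Largest number of points in any cell of a regular (R.A., magnitude) grid.

    Points are assumed to lie inside the given ranges.
    """
    ra_lo, ra_hi = ra_range
    mag_lo, mag_hi = mag_range
    counts = {}
    for ra, mag in points:
        i = min(int((ra - ra_lo) / (ra_hi - ra_lo) * n_ra), n_ra - 1)
        j = min(int((mag - mag_lo) / (mag_hi - mag_lo) * n_mag), n_mag - 1)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    return max(counts.values())


def clump_p_value(observed, ra_range, mag_range, n_trials=200, seed=0):
    """Fraction of flat random samples whose densest cell matches the observed one."""
    rng = random.Random(seed)
    observed_max = max_cell_count(observed, ra_range, mag_range)
    exceed = 0
    for _ in range(n_trials):
        # Random sample of the same size, flat in right ascension and magnitude.
        fake = [(rng.uniform(*ra_range), rng.uniform(*mag_range)) for _ in observed]
        if max_cell_count(fake, ra_range, mag_range) >= observed_max:
            exceed += 1
    return exceed / n_trials


# Made-up sample: a flat background plus a compact clump.
rng = random.Random(1)
ra_range, mag_range = (-55.0, 60.0), (14.0, 21.0)
background = [(rng.uniform(*ra_range), rng.uniform(*mag_range)) for _ in range(100)]
clump = [(30.0 + rng.uniform(-1, 1), 17.2 + rng.uniform(-0.2, 0.2)) for _ in range(60)]
print(clump_p_value(background + clump, ra_range, mag_range))
```

A value near zero means that random samples essentially never produce a cell as crowded as the observed one, which is the sense in which a clump is inconsistent with random fluctuations.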

The most distant clump (M) is 106 kpc from the Galactic center.
The strongest clump in the left wedge belongs to the Sgr dwarf
tidal stream, as does clump C in the right wedge (Ivezić et al. 2003).
Similarly to the behavior of main-sequence stars discussed by Bell
et al. (2007), the apparent “clumpiness” of the candidate RR Lyrae
distribution increases with increasing radius, as predicted by CDM
simulations (Bullock et al. 2001). A detailed comparison of their
models with the data presented here will be discussed elsewhere
(B. Sesar et al. 2007, in preparation).
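
The approximate distances used for the RR Lyrae candidates can be illustrated with the standard distance modulus. The following is a minimal sketch, assuming only the absolute magnitude of 0.7 mag quoted in the Fig. 12 caption; the function name `distance_kpc` is hypothetical, and the sketch ignores extinction and the band conversion of eq. (3) in Ivezić et al. (2005), so it will not reproduce the tabulated median distances exactly.

```python
def distance_kpc(apparent_mag, absolute_mag=0.7):
    """Heliocentric distance in kpc from m - M = 5 log10(d / 10 pc).

    absolute_mag defaults to 0.7 mag, the value the text assumes for
    RR Lyrae stars. No extinction or band conversion is applied
    (a simplifying assumption of this sketch).
    """
    distance_pc = 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)
    return distance_pc / 1000.0


# Illustrative magnitudes only (not taken from the catalog).
for mag in (17.0, 19.0, 21.0):
    print(f"m = {mag:.1f} mag -> d ~ {distance_kpc(mag):.0f} kpc")
```

Because distance scales as 10^(m/5), a difference of 5 mag in apparent magnitude corresponds to a factor of 10 in distance, which is why the faintest candidates probe the halo out to roughly 100 kpc.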

5.  ARE  ALL  QUASARS  VARIABLE? 

The optical continuum variability of quasars has been recognized
since their first optical identification (Matthews & Sandage
1963),  and  it  has  been  proposed  and  utilized  as  an  efficient  method 
for  their  discovery  (van  den  Bergh  et  al.  1973;  Hawkins  1983; 
Koo  et  al.  1986;  Hawkins  &  Veron  1995;  Rengstorf  et  al.  2004). 
The  observed  characteristics  of  the  variability  of  quasars  are 
frequently  used  to  constrain  the  origin  of  their  emission  (e.g., 
Kawaguchi  et  al.  1998  and  references  therein;  Martini  &  Schneider 

TABLE 3

Approximate Locations and Properties of Detected Overdensities

Label(a)   N(b)   <R.A.>(c)   <d>(d)   <r>(e)   <u − g>(f)   <g − r>(g)
                  (deg)       (kpc)    (mag)    (mag)        (mag)
A           84    330.95       21      17.02    1.14         0.18
B          144    309.47       22      16.76    1.12         0.16
C           54     33.69       25      17.61    1.13         0.20
D            8    347.91       29      18.02    1.14         0.23
E           11    314.06       43      18.84    1.09         0.20
F           11    330.26       48      19.16    1.07         0.20
G           10    354.81       55      19.46    1.10         0.22
H            7     43.57       57      19.32    1.05         0.04
I            4    311.34       72      19.98    1.08         0.11
J           26    353.58       81      20.21    1.11         0.20
K            8     28.39       84      20.35    1.10         0.20
L            3    339.01       92      20.45    1.06         0.16
M            5     39.45      102      20.73    1.07         0.11

(a) The overdensity’s label from Fig. 13.
(b) Number of candidate RR Lyrae stars in the overdensity.
(c) Median right ascension.
(d) Median heliocentric distance (in kpc), computed using eq. (3) from
    Ivezić et al. (2005) and M_V = 0.7 mag as the absolute magnitude of
    RR Lyrae stars in the V band.
(e) Median r-band magnitude.
(f) Median u − g color.
(g) Median g − r color.

2003;  Pereyra  et  al.  2006).  Recently,  significant  progress  in  the 
description  of  quasar  variability  has  been  made  by  employing  the 
SDSS data (de Vries et al. 2003, 2005; Ivezić et al. 2004b; Vanden
Berk  et  al.  2004;  Sesar  et  al.  2006).  Here  we  expand  these  studies 
by  quantifying  the  efficiency  of  quasar  discovery  using  variability. 


A preliminary comparison of color- and variability-based methods
for selecting quasars using SDSS data was presented by Ivezić
et al. (2004c). They found that 47% of spectroscopically confirmed
unresolved quasars with UV excess have a g-band magnitude
difference between two observations obtained 2 yr apart larger than
0.15 mag. We can improve on their analysis because now there
are significantly more observations obtained over a longer time
period. Since quasars vary erratically and the rms scatter of their
variability (the so-called structure function) increases with time
(e.g., Vanden Berk et al. 2004 and references therein), the
variability selection completeness is expected to be higher than the
~50% obtained by Ivezić et al. (2004c).

First, although the adopted variability selection criteria discussed
above are fairly conservative, we find that at least 63% of
low-redshift quasars are variable at the >0.05 mag level
(simultaneously in the g and r bands over observer’s timescales of
several years) in the g < 20.5 flux-limited sample. Second, even this
estimate is only a lower limit; given the spectroscopic confirmation
for a large flux-limited sample of quasars, it is possible to relax
the adopted variability selection cutoff without prohibitive
contamination by nonvariable sources.

There are 2492 unresolved quasars in the catalog of spectroscopically
confirmed SDSS quasars (Schneider et al. 2005) from
stripe 82. The fraction of these objects that vary by more than σ in
the g and r bands, as a function of σ, is shown in Figure 14. We also
show the analogous fraction for stars from the stellar locus. About
93% of quasars vary with σ > 0.03 mag. For a small fraction of
these objects the measured rms scatter is due to photometric noise,
and the stellar data limit this fraction to at most 3%. Conservatively
assuming that none of these 3% of stars are intrinsically


Fig. 13. — Left: Spatial distribution of candidate RR Lyrae stars
discovered by SDSS along the celestial equator. Distance is calculated
assuming eq. (3) from Ivezić et al. (2005) and M_V = 0.7 mag as the
absolute magnitude of RR Lyrae stars in the V band. The right wedge
corresponds to candidate RR Lyrae stars selected in this work (634
candidates, shown in Fig. 12), and the left wedge is based on the sample
from Ivezić et al. (2000; 296 candidates). Right: Number density
distribution of candidate RR Lyrae stars shown in the left panel,
computed using an adaptive Bayesian density estimator developed by
Ivezić et al. (2005). The color scheme represents the number density
multiplied by the cube of the galactocentric radius and displayed on a
logarithmic scale with a dynamic range of 300 (light blue to red). Green
corresponds to the mean density; both wedges with the data would have
this color if the halo number density distribution followed a perfectly
smooth r⁻³ power law. Purple marks the regions with no data. The
yellow/orange regions are formally about 3σ significant overdensities,
and red regions have an even higher significance (using only the count
variance). A comparison of this map with those generated using random
samples of the same size suggests that clumps I and K are consistent
with random fluctuations; one or two of clumps D, F, H, and L may also be
caused by such fluctuations; and it is highly likely that the remaining
seven clumps (A, B, C, E, G, J, and M) represent real halo substructure
(they account for about 50% of the sample in the right wedge). The
strongest clump in the left wedge belongs to the Sgr dwarf tidal stream,
as does clump C in the right wedge (Ivezić et al. 2003). The approximate
locations and properties of labeled overdensities are listed in Table 3.
The Ivezić et al. (2000) sample is based on only two epochs and thus has
a much lower completeness (~56%), resulting in a lower density contrast.

Fig. 14. — Fraction of spectroscopically confirmed unresolved QSOs
(f_QSO, solid line) and fraction of sources from the stellar locus
(f_loc, dashed line) brighter than g = 19.5 and r = 19.5 that have rms
scatter larger than σ in the g and r bands. The ratio f_QSO/(1 + f_loc)
(dotted line), which corresponds to the implied fraction of variable
QSOs, peaks at a level of 90% for σ = 0.03 mag. [See the electronic
edition of the Journal for a color version of this figure.]

variable, we estimate that at least 90% of quasars are variable
at  the  0.03  mag  level  on  timescales  up  to  several  years. 
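
The 90% estimate can be checked from the two fractions quoted above (about 93% of quasars and at most 3% of stellar locus stars exceed 0.03 mag). The sketch below is illustrative only: the function names are hypothetical, the toy rms values are invented for the example, and the ratio is the one given in the Fig. 14 caption rather than a full reanalysis.

```python
def fraction_above(rms_g, rms_r, sigma):
    """Fraction of sources with rms scatter above sigma in both g and r."""
    if len(rms_g) != len(rms_r) or not rms_g:
        raise ValueError("need two non-empty sequences of equal length")
    n_var = sum(1 for g, r in zip(rms_g, rms_r) if g > sigma and r > sigma)
    return n_var / len(rms_g)


def implied_variable_fraction(f_qso, f_loc):
    """Ratio f_QSO / (1 + f_loc), as quoted in the Fig. 14 caption."""
    return f_qso / (1.0 + f_loc)


# Toy rms values in mag (made up for illustration, not catalog data).
toy_rms_g = [0.01, 0.04, 0.10, 0.06]
toy_rms_r = [0.02, 0.05, 0.08, 0.02]
print(fraction_above(toy_rms_g, toy_rms_r, sigma=0.03))

# Fractions quoted in the text for sigma = 0.03 mag.
print(round(implied_variable_fraction(0.93, 0.03), 3))
```

With f_QSO = 0.93 and f_loc = 0.03 the ratio is about 0.90, consistent with the lower limit stated in the text.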

6.  IMPLICATIONS  FOR  SURVEYS  SUCH  AS  LSST 

The LSST is a proposed imaging survey that aims to obtain
repeated multiband imaging to faint limiting magnitudes over a
large fraction of the sky. The LSST Science Requirements
Document²⁸ calls for ~1000 repeated observations of a solid angle of
~20,000 deg² distributed over the six ugrizY photometric
bandpasses and over 10 yr. The results presented here can be
extrapolated to estimate the lower limit on the number of variable
sources that the LSST would discover.

The single-epoch LSST images will have a 5σ detection limit²⁹
at r ~ 24.7. Hence, 2% accurate photometry, comparable to the
subsample with g < 20.5 discussed here, will be available for
stars with r < 22. The USNO-B catalog (Monet et al. 2003)
shows that there are about 10⁹ stars with r < 21 across the entire
sky. About half of these stars are in the parts of the sky to be
surveyed by the LSST. The simulations based on contemporary
Milky Way models, such as those developed by Robin et al.
(2003) and Jurić et al. (2007), predict that there are about twice as
many stars with r < 22 as with r < 21 across the whole sky.
Hence, it is expected that the LSST will detect about a billion
stars with r < 22. This estimate is uncertain to within a factor of
2 or so due to unknown details in the spatial distribution of dust
in the Galactic plane and toward the Galactic center.

We found that at least 0.5% of stars from the main stellar locus
can be detected as variable with photometry accurate to ~2%. This
is only a lower limit because a much larger number of LSST
observations obtained over a longer time span than the SDSS data
discussed here would increase this fraction. Ongoing LSST
simulations suggest an increase by a factor of 10 for stars with
r < 24.5. Hence, our results imply that the LSST will discover at
least 50 million variable stars (without accounting for the fact
that stellar counts greatly increase closer to the Galactic plane).
Unlike the SDSS sample, where RR Lyrae stars account for
~25% of all variable stars, the number of RR Lyrae stars in the
LSST sample will be negligible compared to other types of
variable stars.

As estimated by Jurić et al. (2007) using deeper co-added SDSS
photometry, there are about 100 deg⁻² low-redshift quasars with
r < 22 (see also Beck-Winchatz & Anderson 2007 and references
therein). Therefore, with a sky coverage of ~20,000 deg²,


28 Available at http://www.lsst.org/Science/lsst_baseline.shtml.

29 The LSST Exposure Time Calculator is available at http://www.lsst.org.


the LSST will obtain well-sampled, accurate, multicolor light
curves for ~2 million low-redshift quasars. Even at the redshift
limit of ~2, this sample will be complete to M_r ~ −24, that is,
almost to the formal quasar luminosity cutoff, and will represent
an unprecedented sample for studying quasar physics.
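
The order-of-magnitude estimates in this section can be reproduced with simple arithmetic. The sketch below only restates the numbers quoted in the text (about 10^9 stars with r < 21 over the whole sky, half of them in the LSST footprint, a factor of 2 more stars to r < 22, a 0.5% variable fraction, a factor of 10 gain from LSST sampling, and about 100 quasars per square degree over ~20,000 square degrees); the function and parameter names are hypothetical.

```python
def lsst_star_count(stars_r21_all_sky=1e9, lsst_sky_fraction=0.5,
                    r22_to_r21_ratio=2.0):
    """Approximate number of stars with r < 22 in the LSST footprint."""
    return stars_r21_all_sky * lsst_sky_fraction * r22_to_r21_ratio


def lsst_variable_stars(n_stars, variable_fraction=0.005, lsst_gain=10.0):
    """Lower limit on variable stars: SDSS fraction times the LSST gain."""
    return n_stars * variable_fraction * lsst_gain


def lsst_quasars(surface_density_per_deg2=100.0, area_deg2=20000.0):
    """Low-redshift quasars with r < 22 over the survey area."""
    return surface_density_per_deg2 * area_deg2


n_stars = lsst_star_count()
print(f"stars with r < 22:  {n_stars:.1e}")
print(f"variable stars:     {lsst_variable_stars(n_stars):.1e}")
print(f"low-z quasars:      {lsst_quasars():.1e}")
```

Every input carries the uncertainties stated in the text (for example, the star count is uncertain to within a factor of about 2), so the outputs should be read as rough lower limits rather than predictions.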

7.  CONCLUSIONS  AND  DISCUSSION 

We have designed and tested algorithms for selecting candidate
variable sources from a catalog based on multiple SDSS imaging
observations. Using a sample of 13,051 selected candidate
variable sources in the adopted g < 20.5 flux-limited sample, we
find that at least 2% of unresolved optical sources at high Galactic
latitudes (b < −20°) appear variable at the >0.05 mag level
simultaneously in the g and r bands. A similar fraction of variable
sources (~1%) was also found by Sesar et al. (2006) using
recalibrated photometric POSS and SDSS measurements and by
Morales-Rueda et al. (2006) using the Faint Sky Variability Survey
data (~1%).

Thanks to the multicolor nature of the SDSS photometry, and
especially the u-band data, we can obtain robust classification of
selected variable sources. The majority (2 out of 3) of variable
sources are low-redshift (<2) quasars, although they represent
only 2% of all sources in the adopted g < 20.5 flux-limited sample.
We find that about 1 out of 4 of the variable stars are RR
Lyrae stars, and that only 0.5% of the stars from the main stellar
locus are variable at the 0.05 mag level.

The distribution of the light-curve skewness, γ(g), for main stellar
locus stars is bimodal, suggesting at least two, and perhaps more,
different populations of variables. About one-third of the variable
stars from the stellar locus show gray flux variations in the g and
r bands [σ(g)/σ(r) ~ 1] and positive light-curve skewness, suggesting
variability caused by eclipsing systems. This population has an
increased fraction of M-type stars.

RR Lyrae stars show the largest rms scatter in the u and g bands,
followed by low-redshift quasars. The ratio of rms scatter in the g
and r bands for RR Lyrae stars is ~1.4, in agreement with the
Ivezić et al. (2000) results based on two-epoch photometry. The
mean light-curve skewness for RR Lyrae stars is approximately
−0.5, in agreement with Wils et al. (2006). We selected a sample
of 634 candidate RR Lyrae stars with an estimated >95%
completeness and ~70% efficiency. Using these stars, we detected
rich halo substructure out to distances of 100 kpc. The apparent
“clumpiness” of the candidate RR Lyrae distribution increases
with increasing radius, similar to CDM predictions by Bullock
et al. (2001).

Low-redshift quasars show a dependence of σ(g)/σ(r) on redshift,
consistent with discussions in Richards et al. (2002) and
Wilhite et al. (2005). The light-curve skewness distribution for
quasars is centered on zero in all photometric bands. We find that
at least 90% of quasars are variable at the 0.03 mag level (rms) on
timescales up to several years. This confirms that variability is as
good a method for finding low-redshift quasars at high (|b| >
30°) Galactic latitudes as UV excess color selection. The fraction
of variable quasars at the >0.1 mag level obtained here (30%;
see Fig. 14) is comparable to 36% found by Rengstorf et al. (2006).

The multiple photometric observations obtained by the SDSS
represent an excellent data set for estimating the impact of surveys
such as the LSST on studies of the variable sky. Our results
indicate that the LSST will obtain well-sampled 2% accurate
multicolor light curves for ~2 million low-redshift quasars and
will discover at least 50 million variable stars. The number of
variable stars discovered by the LSST will be of the same order
as the number of all stars detected by the SDSS. With about 1000
data points in six photometric bands, it will be possible to recognize
and classify variable objects using light-curve moments of higher
order than the skewness discussed here, including light-curve
folding for periodic variables.
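
As an illustration of the low-order light-curve statistics discussed here and of one higher-order moment, the sketch below computes the rms scatter, skewness, and excess kurtosis of a single-band light curve. It is a minimal example using plain (biased) sample moments; the function name is hypothetical, and the exact estimators used for the catalog (for example, any correction for photometric errors) may differ.

```python
import math


def light_curve_moments(mags):
    """Return (rms, skewness, excess kurtosis) of a list of magnitudes.

    Uses plain central moments about the mean; this is an assumed,
    simplified definition, not necessarily the catalog's estimator.
    """
    n = len(mags)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(mags) / n
    m2 = sum((m - mean) ** 2 for m in mags) / n
    if m2 == 0.0:
        raise ValueError("light curve has zero variance")
    m3 = sum((m - mean) ** 3 for m in mags) / n
    m4 = sum((m - mean) ** 4 for m in mags) / n
    rms = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return rms, skewness, excess_kurtosis


# Toy light curve: mostly constant with one fainter (larger magnitude)
# point, which gives positive skewness in magnitudes.
print(light_curve_moments([18.0, 18.0, 18.0, 18.0, 18.5]))
```

In magnitudes, occasional dimming events produce a tail toward larger values and hence positive skewness, the behavior the text associates with eclipsing systems.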